Note: All materials for this case and the supporting teaching note were finalized prior to FireEye’s public announcement on 08 December
2020 that “A highly sophisticated state-sponsored adversary stole FireEye Red Team tools”, as reported in their public blog post: 
https://www.fireeye.com/blog/threat-research/2020/12/unauthorized-access-of-fireeye-red-team-tools.html . We leave it to the 
imagination of case instructors and users to figure out how to work this important news update into the case. 


When this case was first written in mid-2020, the company was called FireEye. On 05 October 2021, the FireEye hardware products 
business was acquired by a private equity company, and the remaining services and software portions of the former company were 
relaunched under the new corporate name of Mandiant. Everything described in this case is now part of Mandiant. 


CYBERSECURITY AT FIREEYE: HUMAN + AI


July 2020. Steve Ledzian, Chief Technology Officer of FireEye Asia Pacific, sat in his favourite
coffee shop in Singapore and watched the shop manager talking with a ‘handyman’ about a pair
of locks installed at the main door. Physical threats of various forms were something for which human 
beings had prepared themselves for centuries. Ledzian wondered if the coffee shop manager ever 
worried about cyber threats as much, as he scrolled through the day's notices that would go out as 
part of FireEye's Victim Notification program. These were organisations across the world that had 
been hacked one way or another but didn't know it yet. 


With the proliferation of the Internet and the blurring lines between the cyber and the physical worlds, 
cyber threats had become an unavoidable reality. Global expenditure on cybersecurity was estimated 
to be US$43.1 billion as at July 2020.¹ Cybersecurity was a high stakes game, and firms like FireEye
had been increasingly making use of Artificial Intelligence (AI) tools to automate some aspects of 
cybersecurity monitoring and protection. AI tools had allowed the firm to cope with the proliferation 
of cyber threats more efficiently and effectively by supporting and supplementing human expertise. 


FireEye had organised its cybersecurity solutions in a hub-and-spoke model designed to integrate 
machine-generated threat data from its detection and prevention products, with its analytics, response 
expertise and orchestration technologies delivered through a cloud-based cybersecurity operations 
platform. Armed with a moderately-sized team, FireEye had been able to drive its revenue from 
US$11.8 million in 2010 to US$889.2 million in 2019, relying on AI-based tools to execute tedious, 
repetitive tasks.² However, future growth and success depended, in part, on the firm’s ability to
expand its platform and grow its business in response to changing technologies, customer demands 
and competitive pressures. 


The firm had implemented machine learning techniques to reduce the time to discover and distribute 
threat intelligence, as well as generate efficiencies across its product and services offerings. It had 
applied AI solutions to baseline “normal” behaviour to create alerts when anomalies and deviations 
occurred. However, implementing such solutions required benchmarking and validation of the 
solutions, and in turn, refining and training of the algorithms based on the findings. It also needed a 
mind-set shift in which analysts could start trusting a model and use a model’s findings in their
analysis. AI solutions involved an iterative testing and retraining process, and benefits of its 
implementation were sometimes not immediately visible. 


Given the long gestation period of AI solutions, Ledzian wondered how FireEye could deliver its 
expertise seamlessly with the help of AI tools, arming human experts with the exact information they 
needed, when they needed it most. He further pondered - was the Human + AI approach the right 
strategy for FireEye? 


FireEye 


Established in 2004, FireEye was a publicly-traded cybersecurity company headquartered in 
California with offices across North America, Europe, Asia Pacific, Middle East and Africa. The 
company specialised in providing software, hardware and support services in the cybersecurity field. 
It assisted organisations in implementing network security solutions, protecting their networks 
against malicious software and investigating cybersecurity attacks. It also provided its clients with a 
single platform that blended innovative security technology with threat intelligence. 


The FireEye product portfolio included a network security product, email security and an endpoint 
security product. The firm also provided a platform called Helix for managing security operations. 
Besides, it offered core consulting services on incident response and threat intelligence to client 
companies. The core services of the company were provided under the brand name Mandiant. 


Around 69% of FireEye clients were based in the US, and a large percentage of the clients belonged 
to the Computer Software and IT services industry. Of all the companies that used FireEye products 
and services, 19% were small enterprises (<50 employees), 40% were medium-sized, and 41% were 
large companies (>1000 employees).³ The company had more than 9,000 customers across 103
countries and provided protection from cyber threats to more than 1,000 government and law 
enforcement agencies worldwide. 


In March 2020, FireEye won a very high profile competition sponsored by the US Navy, called the 
Artificial Intelligence Applications to Autonomous Cybersecurity Challenge (AI ATAC). This 
competition explored the capability for endpoint security products to incorporate machine learning 
(ML) and artificial intelligence (AI) models to detect and defeat indicators of compromise from 
various advanced malware strains. FireEye’s AI-based product, MalwareGuard, emerged as the
winner in the competition. 


The total staff strength of the company globally was approximately 3,200 employees - comprising 
more than 700 highly experienced threat researchers, platform engineers, malware analysts, 
intelligence analysts and investigators.⁴ FireEye had deployed more than 17 million virtual machine
sensors globally and blocked between 50,000 and 70,000 confirmed malicious events per hour. As part
of its threat intelligence and advanced practices efforts, it constantly monitored the activities of over 
40 advanced persistent threat (APT) groups and 10 financial (FIN) threat groups.° It also tracked 
nearly 2,000 uncharacterised (as in identified, but not yet named) threat groups or clusters. 


AI


AI-based systems had started to play a significant role in virtually every major industry, including
cybersecurity.’ The adoption of AI had accelerated in recent years across a wide range of industries 
for the following reasons: the availability of large data sets, the decreasing cost of computation at
scale, the availability of specialised hardware (GPUs) for performing neural network computations,
and ongoing improvements in deep learning methods. Because of a new wave of enthusiasm for AI 
applications driven by increasing commercial usage of machine learning methods, the global AI 
market was expected to grow to US$191 billion by 2025.° 


AI methods had been used to support and partially automate cybersecurity monitoring and response 
for decades, going back to the wide-spread use of rule-based and other types of knowledge-based 
systems starting in the 1980s. AI for anomaly detection and pilot machine learning solutions for 
cybersecurity settings were implemented in the latter part of the 1990s.’ Around 2015, cybersecurity 
companies had started to aggressively pursue the use of data-driven machine learning (ML) methods 
to both support and automate the analysis of cybersecurity monitoring data for tasks such as detecting 
anomalies in data, identifying and defending against attacks, and sometimes even for pre-empting 
attacks.¹⁰ In a survey conducted by Oracle and KPMG, it was noted that 53% of companies surveyed
had already implemented ML for cybersecurity purposes by 2019.¹¹


The more recent wave of AI technology deployment (since about 2015) had proven to be increasingly 
effective in cybersecurity to detect suspicious activity, to pinpoint compromises and incidents within 
the network infrastructure, and to enable companies to respond to cyber threats at a faster pace. The 
clear benefits of AI in cybersecurity applications were the efficiency of using such tools to process 
and analyse vast amounts of data in a relatively short timeframe, to recognise patterns of interest, and 
to support predictions in the form of relevant classifications, categorisations, and pattern 
identification (refer to Exhibit 1 for Milestones in AI history). 


AI + Human at FireEye


At FireEye, AI solutions, including machine-learning based methods, had been implemented for a 
variety of applications including malware detection and antivirus support, malicious PowerShell 
detection, tools for email monitoring and phishing attack detection, as well as a variety of tools to 
support internal staff doing security operation centre analysis, incident response, and reverse 
engineering (refer to Exhibit 2 for FireEye AI solutions). 


To guide its internal thinking on identifying scenarios in which a machine or a human expert would 
be the most effective approach to solve cybersecurity challenges, the firm had conceptualised an 
“automatability spectrum”, which took multiple factors into consideration to determine the degree of
automatability of a task (refer to Exhibit 3A for FireEye’s Automatability Spectrum assessment
factors and 3B for FireEye’s Automatability Spectrum). On the right side of the spectrum were tasks
that were easier to automate with AI tools, and technology was the preferred method. On the left side 
of the spectrum were tasks that were difficult to fully automate with AI and required human 
intervention, with AI tools playing a supporting role.'° 


Based on automatability, FireEye viewed its AI application for malware detection as being towards 
the far right side of the spectrum. Tools like Strings Sifter¹⁴ were located towards the right side of
the spectrum, but closer to the middle as these applications were used to support the human analyst. 
Examples of tasks that FireEye considered to be on the far left of the spectrum (those that required 
higher levels and greater degrees of human expertise) included understanding and responding to a 
novel attack and various aspects of threat attribution (analysis effort to attribute the identity of threat 
perpetrators). Threat attribution included the analysis and identification of whether an attacker was a 
new entity not previously encountered or an existing catalogued entity. Known entities included 
existing APT groups, FIN groups, or existing unclassified threat groups or clusters. For threat 
attribution tasks, ML methods could be used in powerful ways to support human effort. However, 
such tasks were sufficiently complex, and it was preferable—if not outright necessary—for the
analysis to be driven by a human with the ML model serving as a support vehicle. 


Using ML for Threat Attribution 


The Advanced Practices group at FireEye was a behind-the-scenes unit that supported advanced 
capability development and analysis for all revenue-generating product and service units within the 
company. Within the Advanced Practices group, there was a special elite team called the Adversarial
Pursuit (AP) team. Both the AP team and its parent Advanced Practices group were headed by Steven
Stone. 


Stone described the work of his team as “Hunting Big Evil.”¹⁵ There were three basic tasks that his
team performed: analytic decisions, pursuing adversaries, and threat knowledge management. In 
2018, the AP team had deployed an internally developed ML tool called Atomicity to support its 
threat attribution work.¹⁶ The development of the Atomicity tool had been triggered by an
operational need — processing large amounts of data to sift through information and decipher meaning 
from the data. FireEye had collected a considerable amount of telemetry data about the operating 
patterns of various attackers, details of commonly used malware, and other threat techniques used by 
hackers. This had been assimilated through hundreds of investigations and analyses performed for 
clients over prior years. The firm had also collected information on thousands of uncharacterised 
threat clusters (UNC), which were threat groups or clusters that were deemed to be distinct from 
already known and named entities (APTs or FINs) and also from other pre-existing UNCs.¹⁷


Before implementing the Atomicity tool, threat analysts on FireEye’s AP team were using
manual methods as key parts of their analysis to decide whether or not to merge a particular
UNC with an existing threat cluster.¹⁸ However, as the scale and scope of data to be considered had
been increasing so rapidly in recent years, this had become a mammoth challenge.¹⁹


The need for the Atomicity model in threat cluster similarity analysis 


Stone had noted the hurdles his team were facing in processing the huge amount of data to perform 
threat cluster similarity analysis.²⁰ A major challenge was understanding the degree of similarity
across thousands of UNCs that were being tracked. Until 2018, the team had used a manual method 
to extract and compile information from various sources to compare UNCs. 


However, the AP team was already stretched with the amount of work it was doing. Moreover, the 
tedious and time-intensive task of manually deciding the most relevant candidates to compare against, 
and collating all of the support data to do the analysis, distracted their attention from the primary task 
which was to evaluate the evidence and make the decision on whether a UNC needed to be merged 
with an existing entity. 


When the AP team identified a UNC as being the same entity as a pre-existing one (with high 
probability), they would make the decision to merge the two groups together. The merger provided 
the team with additional “inherited” knowledge of threat capabilities, methods, behaviours, useful 
background knowledge and helped them to anticipate what would happen next. They could also see 
how the existing threat cluster had been evolving. Stone explained, “When we had identified that it 
was warranted to merge a new unidentified threat cluster with a pre-existing entity, we became much 
more familiar with the situation. This guided us in what to look for, and in determining what response 
and remediation steps to take.”


Decision to adopt ML


Stone brainstormed with his team to evaluate if an ML solution could help them offload some of the 
repetitive, tedious tasks performed in threat cluster similarity analysis. The decision to adopt an ML 
tool to resolve the UNC categorisation challenge required several considerations. Firstly, it was 
dependent on the data — whether the data was sustainable, how quickly it changed, and whether the 
dataset was in the right state for it to work well with the ML model. If the data was not in the right 
state, it would need to be cleaned and sanitised for it to work well with the model. 


Secondly, from the cost perspective, a cost versus value analysis was required to evaluate whether the
implementation of the ML model would provide enough value, or whether it would be better to have a
different type of support tool like a rule-based solution for the task. Thirdly, the adaptability of the model, and
how often it would need to be trained for it to work well with changing data was also a consideration. 


Considerations for the model and development approach 


The AP team decided to build an ML tool after several brainstorming sessions. Stone approached 
FireEye's Data Science team to get their help in building a model that could automatically generate 
analysis outputs that his SMEs could use to improve the efficiency as well as the quality of their
similarity comparisons. Stone believed that the data required to do this analysis already existed
within the company. 


FireEye network, endpoint, and email security controls deployed across the globe were built to allow 
telemetry to flow back through different channels.²¹ The firm had already worked through the
technical, infrastructure and data management challenges of centralising this vast amount of 
telemetry information, such that it could be analysed, standardised, automated and scaled. 
Centralisation of data had enabled the AP team to have visibility of all threats against all clients. This 
was an entirely different and enhanced capability than only seeing a single threat across multiple 
clients, or seeing multiple threats for a single client. 


The challenge was to find a method to analyse the centralised information and create a better way of 
comparing a recently identified UNC against all other threat entities that were being tracked. Towards
this end, devising a method required a very high degree of cross-functional collaboration across 
FireEye’s data science experts, cyber threat analysis domain experts, and IT infrastructure and 
platform engineering experts. This meant that building the model would require time commitment, 
contribution and collaboration of teams of people across all three of these areas of expertise. 


A team was formed consisting of experts from each area. Agile methodologies were implemented to 
develop the application as a collaborative effort. Ledzian had noted that many organisations struggled 
to create such collaborations. The tendency was to create ‘silos’ of expertise and let the data scientists 
and IT teams build applications based on Big Data, and then hand those off to the cybersecurity teams 
in the operations centre to test, which would often result in finger-pointing and unsuccessful 
implementations. 


FireEye had implemented processes to enable different teams with a mix of skills and perspectives 
to work collaboratively. It had also implemented decision processes to enable managers and 
department heads to approach Data Science teams and build solutions using an agile approach. 
Moreover, AI solution building was incorporated as part of the organisation’s mission. People at all 
levels were encouraged to trust the algorithms’ suggestions, and at the same time to actively highlight 
cases where they had serious doubts about an algorithm’s output. In such cases of doubt, users of the 
outputs were encouraged to raise requests for analysing the variation and retraining the model.
Teams were also aware that the solutions would be built iteratively, and would not have all the
functionality or the desired degree of accuracy when they were initially deployed. A test-and-learn- 
and-improve mentality and deployment approach was expected. Since data scientists, cybersecurity 
subject matter experts and IT engineers worked in close collaboration, even minor issues in the 
solutions would get highlighted immediately and get fixed before becoming costly problems. The 
agile approach also ensured that the tools were designed quickly. 


Goals of the model 


The goal of building the model was to create a more intelligent and automated tool to help 
systematically and objectively make the comparison of how similar one UNC was to all other UNCs, 
as well as to the entities in other attribution categories. Another goal was to do this more efficiently, 
given that the AP team was always dealing with new UNCs and new incoming telemetry on existing 
UNCs. They needed to make the entire process of making these determinations of similarity more 
systematic and more data-driven. 


Clustering Approach 


To begin building the framework for the ML model, the cyber threat attribution analysts identified 
about 50 important dimensions of cyber-threat (refer to Exhibit 4: FireEye Clustering Approach). 
They worked with the Data Science team to update and reorganise the data set on all the UNCs they 
had been tracking in order to describe each UNC in terms of these dimensions. The analytic challenge 
was to use these dozens of dimensions across thousands of threat entities being observed and tracked 
over extended time periods and to construct an overall composite score resulting in a single- 
dimensional metric of similarity. Machine learning techniques from information retrieval were used 
to represent the different dimensions as a vector of numbers, then the cosine similarity was used to 
measure how close any two vectors (i.e., threat actors) were to one another (refer to Exhibit 5 for a
summary of the Cosine Similarity Algorithm). 
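
The cosine similarity step described above can be illustrated with a minimal sketch. It assumes each threat cluster has already been reduced to a numeric vector of equal length. The function name, the cluster names and the vector values are hypothetical and are not FireEye's actual data or implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: an empty profile matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical feature vectors, e.g. weighted counts of observed
# malware families, tools and infrastructure for each cluster.
unc_a = [3.0, 0.0, 1.0, 2.0]
unc_b = [6.0, 0.0, 2.0, 4.0]  # same proportions as unc_a
unc_c = [0.0, 5.0, 0.0, 0.0]  # no overlap with unc_a

print(cosine_similarity(unc_a, unc_b))  # close to 1: same direction
print(cosine_similarity(unc_a, unc_c))  # 0: nothing in common
```

Because the measure depends only on the direction of the vectors, two clusters with the same mix of behaviours score as highly similar even when one has been observed far more often than the other.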


Evaluating the model 


To evaluate and validate the Atomicity tool, the team applied it on historical information to look at 
all the previous decisions FireEye expert threat attribution analysts had made for merging UNCs. 
These prior decisions were based on human expert determination that two separate cyber threat 
entities were actually the same entity. The evaluation helped in two ways. First, it helped the Data 
Science team to refine and improve the underlying model. Second, it also helped the threat attribution 
analysts to sharpen and deepen their own understanding of their expert reasoning as they had to 
meticulously compare their prior decisions with the results using the new tool. 


The review, validation and revision of the Atomicity ML tool became an ongoing effort and was built 
into the firm’s work process. Every single instance of machine-generated output and human SME 
assessments on the same topic were carefully compared and assessed over an extended period of time, 
and insights gathered from this comparison process were used to further fine-tune and train the model.
The interaction between the ML systems and human SMEs on an ongoing basis also helped foster 
co-learning and knowledge building. The deep embedding of the tool into a larger work process, and 
iterative efforts to keep reviewing and refining model performance over an extended period of time, 
were not only applied to the situation of the Atomicity tool but also to most other ML tools developed 
by FireEye. 


Model training challenges


During the training phase of model building, the team faced a significant challenge - the lack of data
labelled in the appropriate way. As data from most investigations had been manually entered, it had 
the propensity to create noise. Moreover, the breadth of the data was limited by the investigations 
conducted by the firm. Another issue was that the data of target groups had changed over time; for 
example, what attackers were doing in the present could be very different from what they had done 
in the past. What attackers had done lately was usually more important than what they had done a 
few years ago, and this was an important consideration for model training. 


To test the similarity scores, the team needed a ready compilation of test data that captured as many 
scenarios as possible. One option was to use existing data, but it contained only a limited number of
labelled samples of groups (similar and dissimilar) and lacked sufficient information. However, when trying to
assign custom weightings for each topic, the team realised that they needed many labelled samples 
of both similar and dissimilar groups and then fit a regression model to accurately classify them. 
Unfortunately, in the existing data, only a tiny fraction of all potential pairings had been analysed. 
So the existing data was inadequate for regression testing due to insufficient labelled samples. 


To solve the problem of the lack of labelled data, two of the data scientists in the team came up with 
the suggestion of synthetically creating thousands of ‘fake’ but statistically useful clusters by 
randomly sampling from well-established, previously known threat groups. This approach allowed 
the solution to label any two samples that came from the same group as definitely similar and any 
two from separate groups as dissimilar. This approach of synthesising data samples with the 
appropriate statistical properties that could be accurately labelled enabled them to augment whatever 
existing data they had, and they were able to generate a sufficiently large labelled dataset needed to 
train the ML model. The team developed and tested a multi-variate linear regression model that had 
previously been used in natural language processing to assess the similarity of one document to 
another within a large corpus of documents. The synthetically created clusters also allowed testing 
of various iterations of the model, and this testing was used to benchmark and evaluate performance 
as the model was iteratively updated and improved. 
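
The synthetic labelling idea can be sketched as follows. This is an illustration under stated assumptions, not the team's actual code: the group names, the observation strings and the functions `make_synthetic_clusters` and `label_pairs` are all hypothetical. Each 'fake' cluster is a random sample drawn from one well-established group, so any pair of fake clusters can be labelled with certainty.

```python
import random
from itertools import combinations

def make_synthetic_clusters(known_groups, per_group, sample_size, seed=0):
    """Draw random subsets of each known group's observations.

    Returns a list of (source_group_name, sampled_observations) tuples.
    """
    rng = random.Random(seed)
    clusters = []
    for name, observations in known_groups.items():
        k = min(sample_size, len(observations))
        for _ in range(per_group):
            clusters.append((name, rng.sample(observations, k)))
    return clusters

def label_pairs(clusters):
    """Label every pair: 1 if both came from the same known group, else 0."""
    return [
        (a_obs, b_obs, 1 if a_name == b_name else 0)
        for (a_name, a_obs), (b_name, b_obs) in combinations(clusters, 2)
    ]

# Hypothetical, made-up observations for two well-established groups.
known_groups = {
    "GROUP_X": ["malware_a", "tool_b", "domain_c", "technique_d", "tool_e"],
    "GROUP_Y": ["malware_m", "tool_n", "domain_o", "technique_p", "tool_q"],
}

clusters = make_synthetic_clusters(known_groups, per_group=3, sample_size=3)
pairs = label_pairs(clusters)
similar = sum(label for _, _, label in pairs)
print(len(clusters), len(pairs), similar)  # 6 clusters, 15 pairs, 6 labelled similar
```

Labelled pairs of this kind could then serve as the training and benchmarking set for a regression model that learns per-topic weightings, as the case describes.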


Emphasis on explainability 


At the organisation level, FireEye tried to ensure that all its SMEs and Data Science experts were 
able to understand how the ML-based models worked and generated outputs by organising training 
sessions for all team members. This was to ensure that threat analysts could trust, understand and 
explain the output of the ML system for further analysis. When the analyst could understand and 
explain the similarity scores automatically generated by the system, they felt more comfortable and 
confident in using this information in conjunction with their own expertise to make the actual 
decisions related to whether or not to merge entities. In instances where the AP team could not 
understand the output, they would take this feedback to the Data Science team, and find a way to 
make the model and output understandable or in some scenarios, where appropriate retrain the model 
(refer to Exhibit 6A for FireEye’s ML approach and 6B for Model Usage).


Application of the Model 


The Atomicity model had provided Stone and his team a framework to calculate and discover 
similarities between groups of activities, and then develop investigative leads for follow-on analysis. 
For every new threat cluster group identified, the tool automatically generated a list of 10 most similar 
existing threat cluster groups based on their similarity score. The analysts could use this output, 
together with their human expertise, to probe why they thought the two groups (the new threat cluster 
versus any of the 10 items on the list of most similar existing clusters) were the same or not, with
deeper attention and higher priority on candidates that had a higher similarity measure. 
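
The shortlist of the 10 most similar candidates can be sketched as a simple ranking over similarity scores. The sketch assumes the scores for one new cluster against every tracked entity have already been computed. The function `top_candidates`, the entity names and the score values are hypothetical.

```python
import heapq

def top_candidates(scores, n=10):
    """Return the n (entity_id, score) pairs with the highest similarity score."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])

# Hypothetical similarity scores of one new cluster against tracked entities.
scores = {f"UNC{i}": round(0.05 * i, 2) for i in range(1, 16)}
scores["APT_EXAMPLE"] = 0.97

shortlist = top_candidates(scores)
for entity_id, score in shortlist:
    print(f"{entity_id}\t{score:.2f}")
```

The ranking only orders the candidates. As the case notes, the decision on whether two groups are the same entity stays with the analyst, who gives deeper attention to the entries at the top of the list.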


The experience of the AP team working with the Atomicity tool was unanimously positive. This 
motivated the organisation to include the tool in several of its products and services that required 
assessment of threats based on similarity. The implementation of the Atomicity ML tool also changed 
the way teams at FireEye worked across the organisation and with other departments. 


Soon after deployment, the tool had successfully helped identify similarities between a series of 
intrusions within the engineering industry and associate them with a threat group called APT33. This 
discovery was followed by comparative analysis to verify the findings (refer to Exhibit 7 for Role 
of Atomicity in Identifying APT33 attacks). 


For Steven Stone’s Advanced Practices group, their objective was to implement ML solutions whereby
they could clearly justify the expense of building and maintaining the tool. Towards this end, they 
focused on use cases and scenarios in which they had high volumes of recurring work (the need to 
compare the similarity of threat clusters) and for which there was a massive amount of data available. 
They purposely stayed away from considering application scenarios that were more akin to 
“searching for the needle in the haystack”, i.e., searching for that rare and not well-understood event 
for which they had very little data or experience. 


Benefits of the model 


There were also consequential benefits from the insights gathered from using the Atomicity tool, 
including substantial direct efficiency and productivity benefits. For a UNC being analysed, the 
expert analysts doing adversarial pursuit no longer had to spend substantial amounts of time 
pondering which other UNCs and identified threat groups (APTs, FINs) needed to be considered for
similarity comparisons. With Atomicity, their starting point was an automatically generated 
prioritised list of top ten candidates to consider. Not only was this method much faster, it was also 
better. The similarity scores were all computed in a known and systematic way and made use of a 
much larger data set that always considered all of the available information over multiple time periods. 
Human analysts could not have made use of all of this data in such a systematic and comprehensive 
way. With a prioritised list of candidates available at the outset, the analysts were able to focus their 
time on applying their expertise to decisions on merging a UNC and on threat attribution.


Additionally, the team could now visualise threat clusters with a wider lens. Using the tool, the team 
could also gather macro-level insights about the current and changing properties of the entire 
“universe” of all the threat clusters they were looking at. In other words, they could literally visualise 
the extent to which all of these threat clusters were moving toward or away from one another in terms 
of the degree of similarity, using the aggregate similarity scores for each cluster. Also, by looking 
across all entities, comparing changes over time in the overall composite score for each entity, and 
also considering the detailed sub-components used to compute the composite score, they could track 
and visualise the spread of specific cyber-attack techniques across the “universe” of UNCs. They 
could track trends of an attack technique (like whether a particular threat cluster was becoming an 
“arms supplier”) and visualise and measure the spread of that technique across the “universe” of 
clusters. 


The Human + AI advantage


FireEye’s human + AI approach to cybersecurity was helpful in several ways, as Ledzian elaborated,


SMU-20-0042 Cybersecurity at FireEye: Human + AI 


Cyber Security analytics at FireEye comprises both human expertise and automated AI tools. In 
cybersecurity, the biggest challenge is the skills gap. We do not have enough human analysts to 
do everything that we want to do. AI/ML tools help analysts to take away some of those very 
tedious manual tasks that they are required to do, so they are freed up to do higher-order tasks 
that require expert human decision-making skills. The tools also help prioritise what human 
analysts should focus on. 


Achieving good ML outcomes was truly based on the sanctity of the data, including the quality, type, 
and amount of test data available for training the model. Ledzian shared, 


ML is one of the useful tools in an array of products. We have quite a wealth of data; in ML data 
is very important, but available datasets have to be kept up-to-date. Making sure that the right 
data is fed into the model is important, as the quality of the analysis critically depends on the 
quality of data. 


Many aspects of cybersecurity were open-ended, uncertain and always changing. Analysts were 
sometimes not sure of what threat actors were doing, or did not know about the details of the 
techniques they were using for designing their attacks. Sometimes, victims of cyber threats and 
attacks were not comfortable sharing their data, even with a specialised firm like FireEye that was 
there to help them deal with these issues. For these reasons, gathering data sets needed for threat 
attribution was challenging. While there was always a lot of data available, it was not necessarily the 
right type of data needed for the specifics of threat attribution, or for training the ML model for 
making similarity comparisons. 


The intensive interaction between the data scientists and SMEs in threat attribution did not end once 
the ML model was trained, piloted and implemented. ML model deployment was an iterative process, 
and the model needed to be constantly updated to improve performance and to remain relevant. A 
key reason for this was that cyber threats constantly changed and models would need to be constantly 
revised and retrained to cope with the changes. Ongoing post-deployment ML model maintenance 
had been known to create a large amount of ‘technical debt’ in a number of organisations, and 
FireEye had experienced such challenges as well. 


With ML models often making use of nonlinear dependencies between input data, small changes in 
data could potentially create cascading effects on a model’s accuracy. This could have larger than 
anticipated impact on downstream systems that made use of the predictive capabilities of these 
models. In cybersecurity settings, due to the always-evolving nature of the threat landscape, building 
and supporting ML models faced the inherent conflict between the need to regularly update models 
over time to adjust to new data on new threats, versus the risk of unpredictable and undesirable things 
happening, including unexpected outcomes (like false alerts or missing malicious activity), as a result 
of having changed the models. For many companies in a wide range of industries, including 
cybersecurity, these types of changes created technical debt. 


For a cybersecurity firm, the ongoing need for model updates due to the regular appearance of new 
vulnerabilities always led to the risk that the changes could have negative interactions with the 
cybersecurity infrastructure and related workflow, which threatened to reduce an ML model’s value. 
FireEye had chosen to invest a great deal of managerial attention and technical expertise to balance 
this inherent trade-off between constantly making changes to ML models to incorporate updates and 
managing the risks associated with making those changes. 


A key aspect of the human + AI approach adopted by the FireEye AP team was that it tried to 
consciously keep such technical debt in check, by moderating the extent of its dependency on the 
model and keeping a layer of human support in place (much like a school teacher keeping an 
eye on a student’s learning) to correct the model’s understanding wherever necessary. 


The evolving role of human and machine intelligence in an ever-changing cyber threat landscape 


Despite the proliferation of AI solutions in cybersecurity, cyber threat was still a growing 
phenomenon. Not surprisingly, cyber attackers also used AI to hack through organisational networks 
and systems. In March 2019, the CEO of a UK-based energy firm had thought he was speaking to 
his boss and sanctioned an urgent transfer of US$243,000 to what he believed to be the account of a 
new Hungarian supplier. Within hours, the money had passed through a network of accounts and 
subsequently moved to Mexico to be distributed to other locations. The criminals had used AI tools 
to mimic the voice of the boss in the phone call, and that one AI-enabled conversation had allowed 
them to bypass layers of cybersecurity controls. Their success illustrated how the use of powerful 
developing technologies could change the landscape of cybercrime for attackers as well as for 
defenders. 


The rapidly growing use of new technologies, the Internet of Things (IoT) in particular, posed more 
hurdles, as simple security software and hardware implemented in IoT devices provided infiltrators 
with loopholes and with more attack surfaces, i.e., more opportunities to target unsuspecting victims 
at even larger scale and higher success rates. There were also other technologies like behavioural 
analytics, blockchain and embedded hardware authentication technologies that were increasingly 
being used by both attackers and defenders. While less sophisticated attackers were likely to 
continue using older techniques for long periods of time, more sophisticated attackers were regularly 
innovating and creating new methods of attacking that had not been seen before. It was an ever 
changing cybersecurity threat landscape. 


At the same time, AI had the promise of being a boon for cybersecurity protection. The FireEye 
application of AI to support the AP team was an example of a symbiotic process of human knowledge 
initialising the ML platform solution verification, and then humans learning and sharpening their 
judgement based on the ML model insights. Through the combination of the feedback from expert 
human analysts with the influx of a constant stream of new data, this symbiotic learning process 
could generate continually improving solutions to deal with continuously evolving innovative 
cyber-attacks. Moreover, this exercise could refine the model’s predictive capabilities (like 
probability-based ratings of similarity in the Atomicity model) leading to performance improvement. 


ML approaches that implemented deep learning techniques had shown impressive results in 
automatically learning feature modality of images, text and speech. Such methods did not require 
human expertise to pre-define features, as the multi-layered representation of the input data that 
emerged from the deep learning model (once all of the parameters had been properly tuned) ended 
up automatically creating useful features for discrimination. This type of feature learning was 
sometimes referred to as representation learning. There had been recent examples of cybersecurity 
R&D showing how features were automatically extracted from unlabelled data by training a neural 
network on a secondary, supervised learning task without the involvement of human expertise. 


As FireEye and other cybersecurity companies actively using ML continued to enhance their 
capabilities for developing, deploying and supporting these models, and as they gained more 
experience with deploying and using deep learning models, it raised the question of what role humans 
would play in cybersecurity threat analysis and response tasks in years to come. What type of human 
domain expertise would be important to cultivate and retain going forward with the increased usage 
of ML based support tools and automation systems? 


This consideration was not so simple. FireEye had deep experience and strong evidence to 
demonstrate the utility and expanding applicability of ML models. Given this trend, would this lead 
to all or most of cybersecurity analysis and response tasks becoming fully automated in any 
reasonable future timeframe? Even for some key aspects of analysis and response tasks on the right 
side of the “automatability spectrum” mentioned earlier, including the far right side, where the firm had 
already demonstrated it was possible to achieve higher degrees of automation, it did not foresee that 
ML would be able to take over and fully automate the work in the near term or medium term. And it 
was even harder to fully automate all analysis and response tasks on the left side of the spectrum. 


The fact that the cyber threat landscape was constantly and rapidly shifting meant that there would 
continue to be a need for human capability, increasingly working together with ML support and 
automation systems, to make sense of and respond to unknown situations. Even with increasing 
degrees of automation enabled by increasingly sophisticated ML and AI systems, the need for human 
analysts, as well as data scientists, with deep cybersecurity threat analysis expertise was expected to 
increase, not decrease. Another contributing factor to this expectation was that the limitations and 
brittleness of even the most advanced deep learning systems were well known. 


The roles performed by humans working in cybersecurity, including the roles of expert threat analysts, 
would certainly evolve and change as FireEye and other cybersecurity companies continued to make 
progress with using ML and other types of AI methods, and as these technologies became more 
integrated into their work processes. However, the people who were subject matter experts in these 
domains, as well as the people who were building and improving the analytics models, were not 
disappearing from the scene. In fact, they were busier than ever. 


Staying Ahead 


Cybersecurity was like a never-ending competitive ‘cat and mouse game’ between the legitimate 
protectors and the criminal attackers. And sometimes the protectors became the attackers. It was an 
ever-evolving game space, simultaneously encompassing the equivalent of familiar game 
interactions, as well as novel and previously unseen interactions, and also the emergence of entirely 
new and not yet understood games, all intermixed. What was clear was that the entities (whether they 
were the “good guys” or the “bad guys”) that could move at greater speed and larger scale to make 
use of data to amplify their learning and intelligence would have the advantage, and would ultimately 
prevail. 


Reflecting on the bewildering complexity, Ledzian wondered: Could FireEye create a winning source 
of tools and models to keep up with, or better yet keep ahead of, the attacking opponent every time? 
Did AI really hold the key to the future of cybersecurity? 


FireEye obviously wanted to be on the winning side of the competition. But, what could it do to get 
there? Could an all-automated, AI machine-driven approach be the answer? Or was its existing 
approach, a symbiosis of human capability and AI-enabled machine support, the right way forward? 
Or possibly the only viable way forward? Or were both these approaches suitable simultaneously, 
across every point on the automatability spectrum and in between, depending on the nature of the 
game the protectors were encountering with the attackers? 


Could FireEye raise the bar and develop predictive AI tools that could foretell what a threat entity 
would do in the future? Could FireEye build a Predictive Analytics solution to predict the next 
cyber-attack? 


EXHIBIT 1: MILESTONES IN AI HISTORY 


[Figure: timeline of milestones in AI history. Only the label “2010” survives from the original graphic.] 


Source: Company Data 


EXHIBIT 2: FIREEYE AI SOLUTIONS 


MalwareGuard: One of the solutions introduced was an additional ML layer in its endpoint security 
product called MalwareGuard, which allowed customers to detect and prevent malware from 
executing. 

Strings Sifter: Another solution was an ML model called Strings Sifter, which was used to 
automatically find relevant information from malware code that human analysts could use to 
understand what the malware was trying to do. Strings Sifter identified and prioritised strings 
according to their relevance for malware analysis. The tool was used across multiple FireEye 
products including its email security product and endpoint security product. 

Atomicity: Threat attribution similarity analysis tool. 

Suspicious PowerShell Detection Model: The ML model evaluated PowerShell commands using 
methods from the NLP domain to identify suspicious command line arguments by implicitly learning 
non-linear combinations of certain patterns, making it difficult for adversaries to bypass the checks. 
Threat analysts then evaluated the inferences from the tool and applied their subject matter 
expertise (SME) to decide on requisite actions to be taken. 


Source: Company Data 


EXHIBIT 3A: FIREEYE’S AUTOMATABILITY SPECTRUM ASSESSMENT FACTORS 


• Data: was there a sufficient amount of data available to capture the underlying characteristics of 
interest for the targeted cybersecurity task. 


• Labels: was there sufficient labelled data that could be used to distinguish the entities in all of the 
relevant categories of interest from one another. 


• Velocity: how quickly the entities within a labelled category were changing their characteristics, 
either intentionally to avoid detection, or due to the ongoing changes related to that area. 


• Knowledge: i) did data scientists and domain experts have the knowledge of how to modify the 
available data to put it in a form that makes it easier for the machine to make use of it for the ML 
model building effort, and ii) was there the necessary expertise in the specific AI and ML methods 
needed to build the required models. 


• Maintenance: the complexity and effort of maintaining i) all related code, including the ML 
application code, and all related system code and infrastructure support code, and ii) all related data, 
including data for the ongoing training/retraining efforts needed for the ML model, and the data 
provenance. 


Source: Company Data 


EXHIBIT 3B: FIREEYE’S AUTOMATABILITY SPECTRUM 


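[Figure: a spectrum running from HARD TO AUTOMATE (Human Preferred) on the left to EASY TO 
AUTOMATE (Machine Preferred) on the right, with cybersecurity tasks placed along it. Legible task 
labels include Response, Alert Triage and Malware Classification; one further label, ending in 
“identification”, is only partly legible.] 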


Source: Company Data 


EXHIBIT 4: FIREEYE CLUSTERING APPROACH 


FireEye uses the clustering method to bunch forensically related artefacts into ‘clusters’. Information 
regarding a UNC in the FireEye database is stored in a summary ‘document’, which is further broken 
into sections referred to as ‘topics’. A cluster is represented as a collection of topics. Within each 
topic, data is further segregated into ‘terms’ which have associated counts. The ‘counts’ represent 
the number of times the ‘term’ was used or occurred. For example, if ‘malware’ is a topic, then the 
specific types of malware used would be the ‘terms’, and the number of observed occurrences of each 
specific type would be the frequency count of that term. 


The clustering approach of data is similar in a way to how movies or books are grouped on online 
platforms like Netflix and Goodreads. The clustering approach allows the model to evaluate 
similarities at scale. Each topic is modelled individually such that each topic can produce its own 
measure of similarity between groups, which can then be aggregated to create a holistic similarity 
metric. 
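
The document, topic, term and count structure described above can be sketched in a few lines. This is a minimal illustration only: the topic name "methods", the choice of cosine similarity as the per-topic measure, and the unweighted mean used for aggregation are assumptions, and the counts are illustrative rather than taken from FireEye's data.

```python
from collections import Counter
from math import sqrt

# Hypothetical cluster 'documents': topic -> Counter of term counts.
unc_a = {
    "malware": Counter({"SWEETFACE": 26, "FORKNIFE": 12}),
    "methods": Counter({"spear phishing": 11, "powershell": 13}),
}
unc_b = {
    "malware": Counter({"SOGU": 7, "XTREMERAT": 13}),
    "methods": Counter({"spear phishing": 15, "sql injection": 12}),
}


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def topic_similarities(x, y):
    """One similarity score per topic found in either document."""
    topics = sorted(x.keys() | y.keys())
    return {t: cosine(x.get(t, Counter()), y.get(t, Counter())) for t in topics}


def aggregate(scores):
    """Unweighted mean of the per-topic scores (an assumed aggregation rule)."""
    return sum(scores.values()) / len(scores) if scores else 0.0


scores = topic_similarities(unc_a, unc_b)
overall = aggregate(scores)
print(scores, overall)
```

Here the two clusters share no malware terms, so that topic scores zero, while the shared use of spear phishing gives the "methods" topic a score of roughly 0.5. Keeping the per-topic scores alongside the aggregate lets an analyst see which topics drive the overall figure.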


[Figure: two panels. The first annotates the summary for UNC333, showing that the group is the 
‘document’, each category (such as Known Aliases, Malware, Target Industries and Infrastructure) is 
a ‘topic’, each entry is a ‘term’, and the number of observations is its ‘count’ (for example, under 
Malware: SWEETFACE: 26, FORKNIFE: 12). The second places UNC333 and UNCS501 side by side with 
their terms and counts. Legible examples include Spear Phishing: 11 for UNC333 against Spear 
Phishing: 15 for UNCS501, and Crafting: 14 under Target Industries for UNC333. Many of the 
remaining values are not legible.] 


In the example above for UNC333 and UNCS501, spear phishing is a common term, but spear 
phishing is also common amongst many other groups, hence it may not be a good measure of 
similarity. The term USA will also not be a good measure, as many groups target the USA. Crafting 
could probably be a good measure of similarity, as it is a particular industry. 


Source: Matt Berninger, “Going ATOMIC: Clustering and Associating Attacker Activity at Scale”, FireEye, 
Threat Research, March 12, 2019, 
https://www.FireEye.com/blog/threat-research/2019/03/clustering-and-associating-attacker-activity-at-scale.html , 
accessed May 2020. 


EXHIBIT 5: COSINE SIMILARITY ALGORITHM 


Cosine Similarity(X, Y) = cos(θ) 


[Figure: illustration of cosine similarity. The only surviving labels are “UNC201”, “UNCS99” and “1.0”.] 


Cosine Similarity is a mathematical method to judge the orientation of two input vectors rather than 
their magnitude. Similarity is assigned a value between 0 and 1. If two objects have a lower similarity 
score, the angle between them is large and they are tagged not similar. If they have a larger 
value of similarity, they are termed similar. Cosine Similarity is useful when there are sparse vectors 
or attributes (as in the case of UNCs). The method finds the angle of separation between the two input 
vectors, which shows whether they point in the same direction or otherwise. Such cosine value 
determination is commonly used by recommender systems for e-commerce. For example, the 
recommender system of Netflix shows viewers a list of similar movies using this model - hence a 
person who has seen ‘Avengers’ will not see ‘Gone with the Wind’ in their recommended watch list. 
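
For reference, the standard definition behind cos(θ) can be written out as follows, where X and Y are two count vectors with components x_i and y_i. The component notation is introduced here for illustration and does not appear in the exhibit.

```latex
\[
\text{Cosine Similarity}(X, Y) = \cos\theta
  = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert}
  = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2}\,\sqrt{\sum_i y_i^2}}
\]
```

Because term counts are never negative, the value falls between 0 and 1, which matches the range described above. As a small worked case, X = (1, 1, 0) and Y = (1, 0, 1) give a dot product of 1 and norms of √2 each, so the similarity is 1/2.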


Cosine Similarity is a popular tool in cybersecurity, with multiple applications. For instance, it is 
used to assess the security of passwords by matching an input to existing data to allow users on 
Internet platforms to select a secure password for their username. It is also used by biometric 
recognition applications, where images are matched against existing images to determine 
authenticity. The Cosine Similarity method limits the impact that any particular term or category can 
have on the total similarity. This is helpful in ensuring that the model is judging things fairly. For 
example, in a group of books, the term ‘The’ is likely to be used quite frequently and hence it is 
not fair to assign it a high weightage to measure similarity. However, the word ‘spaceship’ would 
probably occur in far fewer books and could be assigned a higher weightage to 
help group certain books. 
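
One common way to give a ubiquitous word like ‘The’ a low weight and a rarer word like ‘spaceship’ a higher one is inverse document frequency (IDF) weighting applied before the cosine calculation. The sketch below assumes that technique for illustration; the exhibit does not name the weighting scheme actually used, and the book names and counts are made up.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical corpus: each book is a bag of term counts.
docs = {
    "book_a": Counter({"the": 50, "spaceship": 4}),
    "book_b": Counter({"the": 60, "spaceship": 5}),
    "book_c": Counter({"the": 55, "garden": 6}),
}


def idf(corpus):
    """Plain IDF, log(N / df): a term found in every document gets weight 0."""
    n = len(corpus)
    doc_freq = Counter(term for doc in corpus.values() for term in doc)
    return {term: log(n / count) for term, count in doc_freq.items()}


def weighted(doc, weights):
    """Scale each raw count by the weight of its term."""
    return {term: count * weights[term] for term, count in doc.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


weights = idf(docs)
vectors = {name: weighted(doc, weights) for name, doc in docs.items()}

raw_ac = cosine(docs["book_a"], docs["book_c"])            # dominated by 'the'
weighted_ac = cosine(vectors["book_a"], vectors["book_c"])  # 'the' no longer counts
weighted_ab = cosine(vectors["book_a"], vectors["book_b"])  # both feature 'spaceship'
print(raw_ac, weighted_ac, weighted_ab)
```

On raw counts, book_a and book_c look almost identical simply because both use ‘the’ heavily. After weighting, they share nothing distinctive, while book_a and book_b are grouped together by ‘spaceship’. Practical implementations often use a smoothed IDF so that very common terms are damped rather than removed entirely.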


Source: Matt Berninger, “Going ATOMIC: Clustering and Associating Attacker Activity at Scale”, FireEye, 
Threat Research, March 12, 2019, 
https://www.FireEye.com/blog/threat-research/2019/03/clustering-and-associating-attacker-activity-at-scale.html , 
accessed May 2020. 


EXHIBIT 6A: FIREEYE’S ML APPROACH 


[Figure: “ML Applications: Start to Finish”, showing a TRAINING APP and a PREDICT APP. The only 
legible step is “Train data and create model”.] 


Source: Company Data 


EXHIBIT 6B: FIREEYE ML MODEL USAGE 


Source: Company Data 


EXHIBIT 7: ROLE OF ATOMICITY IN IDENTIFYING APT33 ATTACKS 


The Atomicity tool had helped the AP team identify similarities between a series of intrusions within 
the engineering industry and associate them with a threat group called APT33. This discovery was 
followed by a comparative analysis by threat analysts to verify the findings. FireEye analysed all 
available organic information from numerous intrusions and all known APT33 activity. The 
Advanced Practices team found, with medium confidence, that two specific early-phase intrusions 
were the work of a single group. It then reconstructed an operational timeline based on confirmed 
APT33 activity observed in the previous year and determined that there were circumstantial overlaps 
that indicated remarkable similarities with the recent activity, leading to the assessment that the 
intrusions were conducted by APT33. 


APT33 was an identified group of malware attackers with suspected attribution to Iran, which 
targeted the aerospace and energy industries. To keep an eye on groups like APT33 and their 
attempted attacks on its clients, the team at FireEye constantly monitored the activities of such groups. 
APT33 had targeted organisations across geographies, including offices in the U.S., Saudi Arabia and 
South Korea. In December 2018, public reports had indicated links between the SHAMOON malware 
attacks earlier that month and the APT33 group. APT33 had sent spear-phishing emails to employees 
who worked in the aviation industry; the emails included recruitment-themed lures and contained 
malicious links to HTML files. Using Atomicity, the team at FireEye had been able to correlate the 
intrusion and observe a possible relation between APT33 and the SHAMOON attacks. 


Source: Geoff Ackerman, Rick Cole, Andrew Thompson, Alex Orleans, Nick Carr, OVERRULED: Containing a 
Potentially Destructive Adversary, FireEye, Threat Research, December 21, 2018, 
https://www.FireEye.com/blog/threat-research/2018/12/overruled-containing-a-potentially-destructive-adversary.html, 
accessed May 2020. 